Reconstructing the Hopfield network as an inverse Ising problem 



Haiping Huang 

Key Laboratory of Frontiers in Theoretical Physics, Institute of Theoretical Physics, 
Chinese Academy of Sciences, Beijing 100190, China 
(Dated: December 14, 2009) 

We test four fast mean field type algorithms on Hopfield networks as an inverse Ising problem. 
The equilibrium behavior of Hopfield networks is simulated through Glauber dynamics. In the low 
temperature regime, the simulated annealing technique is adopted. Although performances of these 
^\ , network reconstruction algorithms on the simulated network of spiking neurons are extensively stud- 

■ ied recently, the analysis of Hopfield networks is lacking so far. For the Hopfield network, we found 

' that, in the retrieval phase favored when the network wants to memory one of stored patterns, all 

, the reconstruction algorithms fail to extract interactions within a desired accuracy, and the same 

^ . failure occurs in the spin glass phase where spurious minima show up, while in the paramagnetic 

^ ' phase, albeit unfavored during the retrieval dynamics, the algorithms work well to reconstruct the 

I network itself. This implies that, as a inverse problem, the paramagnetic phase is conversely useful 

for reconstructing the network while the retrieval phase loses all the information about interactions 
in the network except for the case where only one pattern is stored. The performances of algorithms 
are studied with respect to the system size, memory load and temperature, sample-to-sample fluc- 
tuations are also considered. 



C/3 



o 



o 



PACS numbers: 02.30.Zz, 02.50.Tt, 75.10.Nr, 84.35. 



I. INTRODUCTION 



' Last years have witnessed a surge of research interests in the inverse Ising problem [l|, 0, H, 0, [E H S H, @ , also 
known as Boltzmann machine learning in statistical inference theory [lol [ll| . This is due to on one hand the fact that a 
\^ • huge amount of data can be collected from many biological systems such as neural networks, gene regulatory networks 
and metabolic networks [H, [H, [Hj EBl , on the other hand to the growing need for novel and efficient algorithms 
to reconstruct the network based on the huge amount of experimental data. Individual elements (e.g., neurons, genes, 
^ [ even computers in the internet) in the network usually interact with each other to yield the collective behavior emerging 
'— 'i at the network level. To extract functional connectivity from the collective behavior of the network, we use {cri}^i 
' to represent the activity (e.g., electric activity of single neuron, expression level of single gene) of each element in a 

^ ■ network with size N, then the likelihood of each state cr is assumed to be -PismgC"") oc exp X)i<j Jij'^i'^j + X^i ^i'^i j 

also named the second-order maximum entropy model studied in Refs. [ip, [3, UBl- {hi,Jij} serve as Lagrange 
multipliers corresponding to the constraints given by {mi,Cy } where rui is the magnetization and two-point 
connected correlation between sites i and j in the statistical physics language. From the experimental data (e.g., 
microarray data or multi electrode recordings), one can measure both the mean activity (m;) of each individual and 
pairwise correlations (Cy ) among them. The inverse Ising problem is to infer the underlying parameters {/li, Jy} 
Qv^ from the knowledge of measured {mi, Cy}, such that the resulting Ising distribution is able to provide an accurate 

, description of the statistics of the experimental data, i.e., (o'i)ising = (o'j)data ' (''■«'^i)ising ^ ('''*^j)data- 
^ ' The pairwise Ising model has been extensively studied as an inverse Ising problem on retinal networks p^ . [T5 | , 
• ^ . cortical networks [l4| and gene regulatory networks [lB|- In Ref. [lB|, it was observed that the maximum entropy 
' principle can be used to extract information about gene interactions and the result reproduces the observed transcript 
^ \ profiles with high fidelity. Schneidman et al also showed that the pairwise Ising model captures ^ 90% of the 
- - I correlation structure of the retinal network activity. On top of the existing spatial correlations among neurons, the 
temporal dependencies were also suggested to be a common feature of cortical networks [l3|. Recently, Marre et al 
employed the same maximum entropy principle with a Markovian assumption to predict the occurrence probabilities 
of spatiotemporal patterns and the result is significantly better than that obtained by Ising models only considering 
spatial correlations. In aspects of theoretical analysis, Roudi et al 0, 0| have investigated the dependence of the 
fit quality of pairwise model upon the time bin size as well as the system size and found that the pairwise model 
always provides an accurate statistical description of spikes as long as the system size does not exceed the critical size 
determined by the mean population firing rate and the bin size. From the algorithmic perspective, Broderick et al 
combined a coordinate descent algorithm with an adaption of the histogram Monte Carlo method to solve the inverse 
problem efficiently up to iV = 40 neurons. Subsequently, Mezard and Mora introduced the message passing ideas 
to the inverse Ising problem and proposed the susceptibility propagation as a comprehensive network reconstruction 
algorithm for sparse network or the network with sufficiently weak interactions 3]. The susceptibility propagation 



together with Sessak-Monasson approximation (SM) recently put forward in Ref. [J], has been tested in Sherrington- 
Kirkpatrick model \17^, and it was found that the message passing based method as well as SM outperforms other 
existing mean field schemes and SM was shown to be more efficient. The message passing technique also found 
application in the inference of gene regulatory networks, and a statistical mechanics analysis has been presented in 
Ref. [l8| . Roudi et al in a recent work Q studied the inverse problem on a simulated network of spiking neurons 
and it was observed that as the network size increases, SM and the inversion of Thouless- Anderson-Palmer (TAP) 
equations outperform other mean field type algorithms to predict the network structure yet the fit quality degrades 
as the system size grows. 

In this paper we use the same four mean field schemes employed in Ref. Q to test their performances on both the 
fully-connected and finite connectivity Hopfield network and investigate the behavior of these inference algorithms 
with respect to the system size, the memory load and particularly different temperatures. It is shown that the 
paramagnetic phase helps to extract the information about couplings in the network, although the recall phase is 
in turn useful during the retrieval process of one of embedded patterns, and in this case the system is prevented 
from entering the paramagnetic or spin glass phase. The naive mean field method (nMF) and TAP exhibit very good 
performances while independent-pair approximation (ind) shows a relatively high inference error. In the paramagnetic 
phase, SM leads to the nearly identical performances with nMF and TAP, and their performances deteriorate with 
increasing network size. Given the fixed system size, all the algorithms show increasing inference errors as the memory 
load increases. In the low temperature region, SM shows relatively high inference errors and is even inferior to ind for 
certain memory loads. The sample-to-sample fluctuations are also taken into account, and we found the inference error 
shows large fluctuations as the temperature decreases. The finite connectivity Hopfield network is also analyzed, as 
expected, ind performs well especially for the small number of stored patterns or at low temperature. In addition, when 
the temperature decreases, the reconstruction algorithms show similar behaviors as observed in the fully-connected 
network. 

The remainder of this paper is organized as follows. The Hopfield model and Glauber dynamics (GD) are introduced 
in Sec. [TTl In Sec. 1111) four mean field schemes, nMF, ind, SM and TAP, are presented briefly. The reconstruction 
performances of these algorithms are reported in Sec. IIVI for the fully-connected network and the finite connectivity 
network respectively. We conclude with our results and future perspectives in Sec. [V] 



II. HOPFIELD NETWORKS AND GLAUBER DYNAMICS 

The Hopfield network, proposed in Ref. [l9j, later thoroughly discussed in Refs. [l^jHH, functions as an associative 
memory network. It is in essence a recurrent network, and its equilibrium properties are determined by the following 
Hamiltonian: 

--'^Jrj(Ti(Tj (1) 

i¥=3 

where Ui represents the state of each neuron in the network. = +1 indicates the neuron i generates a spike while 
Ui = —1 keeps quiescent. Interactions between neurons are constructed according to the Hebb's rule, 

-^--^E^r^." (2) 

where {Cf } taking ±1 with equal probability ^ are P stored patterns. The number of patterns P scales as P = 
aN in the fully-connected case. The Hebb's rule expresses the multiplicative interaction between presynaptic and 
postsynaptic activity and positively correlation (both neurons are on or off) causes an enhanced coupling while 
negatively correlation results in a decreased one. Under the Hebb's rule, the Hamiltonian equation ([1]) can be 
actually rewritten in terms of the overlap between the network configuration and the stored patterns. This guarantees 
that the energy function H always decreases while the system evolves according to the following GD rule. 

The fully-connected Hopfield network requires complete and symmetric connectivity, also no self-interactions. Mean 
field behavior of finite connectivity Hopfield network has been recently studied in Refs. [13, H^. The difference is that 
in the sparse network, the coupling is constructed as follows. 



(3) 
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where I is the mean degree of each neuron. When N oo, P = al. We also assume that no self-interactions are 
present in the finite connectivity Hopfield model and the connectivity lij is symmetric and subject to the distribution 

Pih,) - (1 - j;^)S{h,) + -J^m, 1) (4) 

Taking the thermal fluctuation into account, the GD rule [1^ is specified as P{ai —cFi) — i+e:xp(0An ) '^tiere 
AHi stays for the energy change due to such a flip. Equivalently, the dynamics rule can be recast into [25i] . 

P((Ti -<Ti) = i [1 - (Ti tanh/3/ij] (5) 

where the inverse temperature (3 serves as a measure of degree of stochasticity and hi — "Ylij^i Jij'^j the local field 
acting on Ui. Under this dynamics rule, if the initial configuration is close enough to one of embedded patterns, it 
will tend to evolve towards the nearest attractor represented by one single pattern in the configuration space. That 
is also the meaning of associative memory. 

In this work, we use GD to detect the equilibrium properties of Hopfield networks, then the numerical simulation data 
is used to reconstruct the network. We shall keep to the small size system up to TV = 190 due to the computational cost. 
In the thermodynamic limit, the phase diagram of fully-connected Hopfield network has been studied in Refs. [l^, [2l| 
while that of sparse network in Refs. [23,[23|. In the numerical simulation, two types GD will be implemented. Both 
types are run in a randomly asynchronous manner. The type A is executed as follows. We do the standard GD 
as demonstrated in equation ([5]) totally 4 x 10^ steps (each step corresponds to the process where the state of each 
neuron is updated on average one time), among which the first 2 x 10^ steps are run for the system to reach the 
equilibrium state and the other 2 x 10^ steps for calculating magnetizations and correlations. We sample the state of 
the network every 200 steps. To get around the difficulty that the system is apt to get stuck in local minima of the 
free energy landscape at low temperature, the simulated annealing technique [l7| is introduced in type B GD where 
we set the initial temperature to be 1.0 and the cooling rate 0.005. At each intermediate temperature, we run GD 
10^ steps. When the temperature is decreased to the desired one, we run another 2 x 10^ steps to sample the system. 
The smaller cooling rate and larger steps run at each intermediate temperature are favored but the corresponding 
computational cost increases. 



III. MEAN FIELD SCHEMES FOR INFERRING COUPLINGS 

Boltzmann machine learning works also as a network reconstruction algorithm [Tol.[Tl|. It adjusts couplings at each 
iteration step according to the difference between measured correlation and that produced by the prescribed Ising 
distribution. The algorithm is iterated until the difference falls within the desired accuracy. The Boltzmann machine 
learning is exact but also computationally expensive since a large amount of Monte Carlo sampling steps are required. 
In this section, we will briefiy introduce for the Hopfield network reconstruction four fast mean field approximations 
presented in Ref. @]. 

A. Naive Mean-Field Method 

The naive mean field theory indicates that rrii = tanh ^/i^ -I- Jifemfe^ where rrii = {(Ti). To calculate the 
connected correlation Cy = {(JiUj) — mirrij, we use the fluctuation- response relation, 



(6) 



then obtain the nMF prediction of couplings. 



J-^F = {p-% - {C-% (7) 

where = (1 — ■ml)5ij. Note that the predicted here has been multiplied by /?, therefore the actual predicted 
one Jfj = Jij/p. 
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B. Independent-Pair Approximation 



This approximation assumes that each pair of neurons arc independent of the rest part of the system, i.e., their 



joint probabihty P{ai,aj) oc exp 



Jij ^ j 



where h^P{h^j^) is the local field neuron feels when 



neuron is removed from the system. Then the ind prediction is given by 



J" 



(1-C7:,)2-K; 



(8) 



where C,-, 



Cij + rriimj. 



C. Sessak-Monasson Approximation 

In a recent work by Sessak and Monasson [3| , the SM prediction of couplings is derived using a systematic small 
correlation expansion. 



tSM _ tIoop I Tind ^jd (Q\ 

^ {1 - m^){l - m^A - Cl ^ ' 



where J™'* is given by equation ([8]) and = [LiLj) [M(I + M) ^]^^. where I is an identity matrix, Li 

1 - mf ,M„ = Cjj(LjLj)-i/2 and M,, = 0. 



D. Inversion of TAP Equations 



The usual TAP equation reads hi = tanh ^ — Ej^^i '^ij^'^j + ^'^i S^^i ^iji^ ~ '^^) O- Taking the derivative of 



the field hi with respect to the magnetization rrij, one readily obtains the TAP prediction equation. 



(c-i) 



" dm., ' -^'^ 



TAP 



(10) 



which has been introduced by Kappen and Rodriguez [26] and Tanaka [23]. 

To measure the reconstruction performance of these algorithms, we define the inference error as 



N(N ^l) 



prcd _ jtruo-i2 



1/2 



(11) 



where J^^'^^ is the prediction value of the coupling based on the above four reconstruction algorithms and J*™® the 
original coupling constructed through the Hebb's rule equation ^ or equation ([3|). Obviously, the smaller Jg is, the 
more precisely the pairwise Ising model reproduces the statistics of the experimental data. 



IV. RESULTS AND DISCUSSIONS 



We restrict our analysis to the small size system, although the phase diagram of the Hopfield model was derived 
in the thermodynamic limit [20l. [2ll . [2^ . [23l ] and the result for the small size system will be slightly different due to 
the finite size effects if N is not very small, it is expected that the recall phase, paramagnetic or spin glass phase will 
still appear as T and a vary. They can be characterized by two order parameters, the overlap between the network 
configuration and the /x-th pattern — jj ^i'^i^ ^"^^ mean squared magnetization q = jf X^i^i Scatter 

plots comparing the inferred couplings with the true ones for the fully-connected network are presented in Fig. [T] 
nMF, TAP and SM give very good inferences of couplings while ind shows a relatively high error. The part below the 
line underestimates the true couplings and the part above overestimates the true ones. As shown in Fig. [1] nMF and 
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FIG. 1: (Color online) Scatter plots comparing the inferred couplings with the true ones for /3 = 1.0, a = 0.1, A'^ = 110. Results 
from two samples are shown respectively. The full line indicates equality and the network is fully-connected, (a) nMF, (b) ind, 
(c) SM and (d) TAP. 



TAP show a similar behavior of estimating the true coupUngs. They always underestimate the true negative coupUngs 
but overestimate the positive ones. However, SM behaves conversely. As expected, in this fully-connected network, 
the independent pair approximation fails to recover the couplings within a desired accuracy. The performances versus 
the system size are illustrated in Fig. [2] (a). It turns out that the inference errors increase with the network size 
for nMF, SM and TAP in the paramagnetic phase. Additionally, SM performs better than nMF and TAP. The bad 
predictor ind shows large fluctuations from sample to sample. Fig. [2] (b) presents the reconstruction performance 
versus the memory load at the low temperature. SM behaves badly with large fluctuations, even inferior to ind for 
certain memory loads. However, it should be noted that when just one single pattern is stored, all mean field schemes 
perform relatively better and give /e ~ 0.03. It is due to the fact that there exists only one attraction for the retrieval 
dynamics, therefore it is easy to detect the equilibrium state of the network which contains all the information about 
the couplings. The reconstruction performances against the memory load in the paramagnetic phase are shown in 
Fig. [2](c). As the memory load increases, the algorithms behave worse and their inference errors show large standard 
deviations. However, in the paramagnetic phase, all the algorithms but ind give very small inference errors especially 
at small memory loads. Given a and N ^ we report the inference performance against temperature in Fig. [5] (d). In 
our numerical simulations, we find that in the low temperature region, some approximations, e.g., TAP, fail due to the 
high magnetizations (very close to 1 or —1) or extremely small correlations (~ ©(lO"**)) computed from GD. At the 
same time, the determinant of the correlation matrix C is nearly equal to zero. Therefore the algorithms are unable 



6 





FIG. 2: Inference performances for the fully-connected Hopfield network. The line linking each marker is a guide to the eye. 
The data marker is the average over 5 samples, (a) The inference error against the system size for the paramagnetic phase 
T = 1.5, a = 0.1. Results are obtained based on type A GD. The inset shows an enlarged view of performances for nMF, 
SM as well as TAP. (b) The inference error against the memory load with T = 0.6, = 100. Results are obtained based on 
type B GD. (c) The inference error against the memory load with T = 1.5, A'^ = 100. Results are obtained based on type A 
GD. The inset shows an enlarged view of performances for nMF, SM and TAP. (d) The inference error versus temperature for 
a = 0.03, N = 100. For the low temperature regime, type B GD is used to collect data. 



to extract the couplings. These results are not shown in Fig. [2] (d). For instance, we do observe the final retrieval of 
the stored pattern in a simulation with T = 0.4, TV = 100, P = 3, nevertheless, only nMF is successful but leads to a 
high inference error ~ 0.199 and all other methods fail. In this case, given many stored patterns, if the first pattern is 
assumed to be retrieved, i.e., the overlap Q'^i ~ 1' then the energy function equation ([1]) can be decomposed 

into two terms as 7i = — jTv ^mt^i (-Sj ^i^i)^ ~ JW (Si ^I'^i)'^ ^ ^'i^'i the second term is relevant thus Ti. = — J2i ^i'^i 

where hi = ^ Sj^^j which implies that in the recall phase, the information about couplings is lost and the local 
field contains only the information about the retrieved pattern. The same failure occurs also in the spin glass phase, 
for example, the best performance for one single sample is given by TAP, If. ~ 0.129 with T = 0.4, a = 0.13, N = 100. 
As shown in Fig.[2](d), the inference performance becomes worse as the temperature decreases. With many embedded 
patterns, we will lose the information about couplings even if the retrieval successes, as explained above. On the other 
hand, the free energy landscape becomes complex at low temperatures and it is difficult to detect the equilibrium 
properties of the system, i.e., the emergence of spurious minima will trap the Glauber dynamics and it requires much 
longer GD steps to reduce the noise of estimates of the magnetizations and correlations which affects the inference 
results a lot. For the current system, ergodicity breaking sets in at very low temperature where the replica symmetric 
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FIG. 3: (Color online) Inference performances for the finite connectivity Hopfield network. The line linking each marker is a 
guide to the eye. The data marker is the average over 5 samples. Two ratios a = 0.6, 1.4 are considered with the same mean 
degree / = 5, system size A*' = 100. For the low temperature regime, type B GD is used to collect data. 

assumption was shown to be invalid [28!|, then the magnetizations and correlations lose their physical meanings in 
that they are usually defined in a single state, therefore reconstruction algorithms capable of dealing with the case of 
multi-state are necessary. Finally, it should be mentioned that in the low temperature region, all mean field schemes 
show large sample-to-sample fluctuations and their predictions are unreliable. 

Inference performances of the reconstruction algorithms on the sparse Hopfield network are also considered and 
shown in Fig. [31 At the low temperature, some mean field schemes fail therefore the results are not presented. It 
should be noted that ind performs well to reconstruct the finite connectivity network, consistent with the fact that 
the independent pair approximation is reasonable in the sparse network. Particularly in the low a regime ind shows a 
very good performance, and at the low temperature, it even outperforms other existing mean field schemes although 
the inference errors are relatively large. In the high temperature region, all mean field schemes perform better with 
low a than high a. At temperatures larger than 1.0, SM does a better job than other algorithms, as also observed 
in the fully-connected case. When we decrease the temperature where the system stays, we are confronted with the 
failure of all mean field schemes as appears in the case of fully-connected network, however, when the algorithm works 
well, the predicted couplings between unconnected neurons are very small compared to those between neurons which 
are actually interconnected, therefore one can reconstruct the network using an appropriate cutoff. On the other 
hand, it shows clearly that the sample-to-sample fluctuations become larger and larger as the temperature decreases. 

V. CONCLUSIONS 

In this work, we have tested performances of four fast mean field schemes for inferring couplings of Hopfield networks. 
We find that nMF, SM and TAP show better performances than ind in paramagnetic phase, and their performances 
degrade with the increasing network size and memory load. The inference error achieved by SM shows large sample- 
to-sample fluctuations and becomes inferior to ind in a certain range of memory load as the system is presented in the 
low temperature region. When there is one single pattern stored in the network, all mean field schemes are able to 
do a good job, whereas, as many patterns are embedded, the free energy landscape becomes complex and even if one 
of embedded patterns is found, the information about couplings of the network is inevitably lost and the algorithms 
fail to reconstruct the network. Similarly, in the spin glass phase, due to the emerging spurious minima, it is difficult 
to detect the equilibrium states and the estimated magnetizations and correlations become unreliable. The message 
passing algorithms proposed in Ref. Q may offer hints for improving the performance of network reconstruction. 
Our work implies that as a direct problem, the recall phase is favored, while the unfavored paramagnetic phase is in 
turn most useful for the reconstruction of the network, which is particular for the Hopfield network where couplings 
are constructed according to the Hebb's rule and the stored patterns are in fact random and uncorrelated. For 
the sparse network, when we decrease the temperature and this is necessary for the network to recall one of stored 
patterns, all reconstruction algorithms deteriorate and ind shows a relatively better performance. Whether in the 
fully-connected case or in the sparse case, the performances achieved by the reconstruction algorithms show larger 
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and larger sample-to-sample fluctuations with decreasing temperatures and the predictions become no more accurate. 

The analysis of different approximations on inferring couplings of Hopfield network is kept to in our work, however, 
more analyses on the more realistic networks, such as neural networks gene regulatory networks and other 
biological networks ^1^, are urgently required. This subject of ongoing research will shed light not only on reasons 
why some algorithms fail to reconstruct the network but also on the further mechanisms we need to develop novel 
efficient network reconstruction algorithms for practical data analysis. 
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